AITopics | hard data

hard data

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems Oct-3-2025, 00:18:41 GMT

It could be nice to find a case where the (A,B)-PROD algorithm provides the best results.

algorithm, expert advice, sqrt, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

EnsemW2S: Can an Ensemble of LLMs be Leveraged to Obtain a Stronger LLM?

Agrawal, Aakriti, Ding, Mucong, Che, Zora, Deng, Chenghao, Satheesh, Anirudh, Langford, John, Huang, Furong

arXiv.org Artificial Intelligence Oct-6-2024

How can we harness the collective capabilities of multiple Large Language Models (LLMs) to create an even more powerful model? This question forms the foundation of our research, where we propose an innovative approach to weak-to-strong (w2s) generalization, a critical problem in AI alignment. Our work introduces an easy-to-hard (e2h) framework for studying the feasibility of w2s generalization, where weak models trained on simpler tasks collaboratively supervise stronger models on more complex tasks. This setup mirrors real-world challenges, where direct human supervision is limited. To achieve this, we develop a novel AdaBoost-inspired ensemble method, demonstrating that an ensemble of weak supervisors can enhance the performance of stronger LLMs across classification and generative tasks on difficult QA datasets. In several cases, our ensemble approach matches the performance of models trained on ground-truth data, establishing a new benchmark for w2s generalization. We observe an improvement of up to 14% over existing baselines and average improvements of 5% and 4% for binary classification and generative tasks, respectively. This research points to a promising direction for enhancing AI through collective supervision, especially in scenarios where labeled data is sparse or insufficient.
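
The abstract does not spell out the ensemble procedure. As a rough illustration of the AdaBoost idea it builds on, the sketch below applies textbook discrete AdaBoost weighting to a fixed pool of weak binary labelers and combines their votes into a pseudo-label. The function names and the {-1, +1} label encoding are assumptions made for this example; this is not the EnsemW2S method itself.

```python
import math

def adaboost_weights(predictions, labels):
    """Textbook discrete AdaBoost over a fixed pool of weak labelers.

    predictions: one list of predictions per labeler, values in {-1, +1}
    labels: reference labels in {-1, +1} (e.g. from easy, labeled data)
    Returns one vote weight (alpha) per labeler, in pool order.
    """
    n = len(labels)
    sample_w = [1.0 / n] * n
    alphas = []
    for preds in predictions:
        # Weighted error of this labeler under the current sample weights.
        err = sum(w for w, p, y in zip(sample_w, preds, labels) if p != y)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        alphas.append(alpha)
        # Up-weight the samples this labeler got wrong, then renormalize.
        sample_w = [w * math.exp(-alpha * p * y)
                    for w, p, y in zip(sample_w, preds, labels)]
        total = sum(sample_w)
        sample_w = [w / total for w in sample_w]
    return alphas

def weighted_vote(alphas, votes):
    """Combine one vote per labeler into a single pseudo-label."""
    score = sum(a * v for a, v in zip(alphas, votes))
    return 1 if score >= 0 else -1
```

In a weak-to-strong setting, pseudo-labels produced by `weighted_vote` on unlabeled hard examples could then serve as training targets for a stronger model; how the paper adapts this to generative tasks is not described in the abstract.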

adaboost, generalization, weak model, (13 more...)

arXiv.org Artificial Intelligence

2410.04571

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Maryland (0.04)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.67)
Government > Military (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Can Large Language Models Always Solve Easy Problems if They Can Solve Harder Ones?

Yang, Zhe, Zhang, Yichang, Liu, Tianyu, Yang, Jian, Lin, Junyang, Zhou, Chang, Sui, Zhifang

arXiv.org Artificial Intelligence Jun-18-2024

Large language models (LLMs) have demonstrated impressive capabilities, but still suffer from inconsistency issues (e.g., LLMs can react differently to disturbances like rephrasing or inconsequential order change). In addition to these inconsistencies, we also observe that LLMs, while capable of solving hard problems, can paradoxically fail at easier ones. To evaluate this hard-to-easy inconsistency, we develop the ConsisEval benchmark, where each entry comprises a pair of questions with a strict order of difficulty. Furthermore, we introduce the concept of consistency score to quantitatively measure this inconsistency and analyze the potential for improvement in consistency by relative consistency score. Based on comprehensive experiments across a variety of existing models, we find: (1) GPT-4 achieves the highest consistency score of 92.2% but is still inconsistent on specific questions due to distraction by redundant information, misinterpretation of questions, etc.; (2) models with stronger capabilities typically exhibit higher consistency, but exceptions also exist; (3) hard data enhances consistency for both fine-tuning and in-context learning. Our data and code will be publicly available on GitHub.
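
The abstract names a consistency score without defining it. One plausible reading, taken here purely as an assumption, is the conditional probability that a model answers the easy question correctly given that it answered the paired hard question correctly. A minimal sketch of that reading, with a hypothetical function name and input format:

```python
def consistency_score(results):
    """Estimate P(easy correct | hard correct) from paired outcomes.

    results: list of (easy_correct, hard_correct) booleans, one per
    question pair. Returns None when no hard question was solved,
    since the conditional probability is then undefined.
    """
    easy_given_hard = [easy for easy, hard in results if hard]
    if not easy_given_hard:
        return None
    return sum(easy_given_hard) / len(easy_given_hard)
```

Under this reading, a score of 1.0 would mean the model never fails an easy question whose harder counterpart it solved.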

arxiv preprint arxiv, consistency, probability, (14 more...)

arXiv.org Artificial Intelligence

2406.12809

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

The Unreasonable Effectiveness of Easy Training Data for Hard Tasks

Hase, Peter, Bansal, Mohit, Clark, Peter, Wiegreffe, Sarah

arXiv.org Artificial Intelligence Jan-12-2024

How can we train models to perform well on hard test data when hard training data is by definition difficult to label correctly? This question has been termed the scalable oversight problem and has drawn increasing attention as language models have continually improved. In this paper, we present the surprising conclusion that current language models often generalize relatively well from easy to hard data, even performing as well as "oracle" models trained on hard data. We demonstrate this kind of easy-to-hard generalization using simple training methods like in-context learning, linear classifier heads, and QLoRA for seven different measures of datapoint hardness, including six empirically diverse human hardness measures (like grade level) and one model-based measure (loss-based). Furthermore, we show that even if one cares most about model performance on hard data, it can be better to collect and train on easy data rather than hard data, since hard data is generally noisier and costlier to collect. Our experiments use open models up to 70b in size and four publicly available question-answering datasets with questions ranging in difficulty from 3rd grade science questions to college level STEM questions and general-knowledge trivia. We conclude that easy-to-hard generalization in LMs is surprisingly strong for the tasks studied, suggesting the scalable oversight problem may be easier than previously thought. Our code is available at https://github.com/allenai/easy-to-hard-generalization

generalization, hard data, hardness measure, (15 more...)

arXiv.org Artificial Intelligence

2401.06751

Country: North America > United States > New York (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Text Difficulty Study: Do machines behave the same as humans regarding text difficulty?

Chen, Bowen, Ding, Xiao, Du, Li, Bing, Qin, Liu, Ting

arXiv.org Artificial Intelligence Aug-14-2022

Given a task, humans learn from easy to hard, whereas models learn in random order. Undeniably, difficulty-insensitive learning has led to great success in NLP, but little attention has been paid to the effect of text difficulty. In this research, we propose the Human Learning Matching Index (HLM Index) to investigate the effect of text difficulty. Experimental results show: (1) LSTM has more human-like learning behavior than BERT. (2) UID-SuperLinear gives the best evaluation of text difficulty among four text difficulty criteria. (3) Among nine tasks, performance on some is related to text difficulty, whereas on others it is not. (4) A model trained on easy data performs best on easy and medium data, whereas a model trained on hard data performs well only on hard data. (5) Training the model from easy to hard leads to fast convergence.
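
Finding (5) describes easy-to-hard training order. A minimal sketch of such an ordering is shown below; the helper name is hypothetical, and the token-count difficulty measure in the usage note is a stand-in chosen for illustration, not one of the criteria evaluated in the paper.

```python
def curriculum_batches(examples, difficulty, batch_size):
    """Yield batches ordered from easiest to hardest.

    examples: list of training items
    difficulty: function mapping an item to a sortable difficulty score
    batch_size: number of items per batch
    """
    ordered = sorted(examples, key=difficulty)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]
```

For example, `curriculum_batches(texts, lambda t: len(t.split()), 32)` would feed shorter texts first, treating length as a crude proxy for difficulty.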

criteria, difficulty level, text difficulty, (14 more...)

arXiv.org Artificial Intelligence

2208.14509

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Council Post: How Real Estate Investors Can Use Artificial Intelligence

#artificialintelligence Oct-11-2021, 21:30:24 GMT

Helps passive investors grow wealth through real estate. How do you determine whether one multifamily deal is better than another? Let me start by saying that a deal is only as good as the assumptions you're making as an investor, and assumptions are not guaranteed to materialize. As a real estate investor, operator and syndicator through my company Blue Lake Capital, a significant part of our analysis when considering properties to invest in is looking at how the property performed in the past. That includes analyzing various factors that include vacancy rates, bad debt, concessions, income and expenses.

assumption, projection, use artificial intelligence, (10 more...)

#artificialintelligence

Industry:

Banking & Finance > Real Estate (0.82)
Banking & Finance > Trading (0.50)

Technology: Information Technology > Artificial Intelligence (1.00)

How Automation and Artificial Intelligence will Transform HR Processes

#artificialintelligence Mar-8-2019, 02:19:48 GMT

How 'human' will HR remain, with things like AI and automation on such a meteoric rise? Technology won't take the "human" out of human resources anytime soon, but tools like process automation and artificial intelligence are shaking up HR departments around the world. Machines are taking on time-consuming administrative tasks, empowering recruiters to focus on people management. As businesses shift toward human capital-centered social enterprises, leveraging modern resources for HR rejuvenation is pivotal. The top players reshaping HR departments are business process automation, robotics, and artificial intelligence.

automation, automation and artificial intelligence, transform hr process, (8 more...)

#artificialintelligence

Country: North America > United States (0.16)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.40)
Information Technology > Artificial Intelligence > Robots (0.39)

A Deep-Learning-Based Geological Parameterization for History Matching Complex Models

Liu, Yimin, Sun, Wenyue, Durlofsky, Louis J.

arXiv.org Machine Learning Jul-7-2018

A new low-dimensional parameterization based on principal component analysis (PCA) and convolutional neural networks (CNN) is developed to represent complex geological models. The CNN-PCA method is inspired by recent developments in computer vision using deep learning. CNN-PCA can be viewed as a generalization of an existing optimization-based PCA (O-PCA) method. Both CNN-PCA and O-PCA entail post-processing a PCA model to better honor complex geological features. In CNN-PCA, rather than use a histogram-based regularization as in O-PCA, a new regularization involving a set of metrics for multipoint statistics is introduced. The metrics are based on summary statistics of the nonlinear filter responses of geological models to a pre-trained deep CNN. In addition, in the CNN-PCA formulation presented here, a convolutional neural network is trained as an explicit transform function that can post-process PCA models quickly. CNN-PCA is shown to provide both unconditional and conditional realizations that honor the geological features present in reference SGeMS geostatistical realizations for a binary channelized system. Flow statistics obtained through simulation of random CNN-PCA models closely match results for random SGeMS models for a demanding case in which O-PCA models lead to significant discrepancies. Results for history matching are also presented. In this assessment CNN-PCA is applied with derivative-free optimization, and a subspace randomized maximum likelihood method is used to provide multiple posterior models. Data assimilation and significant uncertainty reduction are achieved for existing wells, and physically reasonable predictions are also obtained for new wells. Finally, the CNN-PCA method is extended to a more complex non-stationary bimodal deltaic fan system, and is shown to provide high-quality realizations for this challenging example.
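
For readers unfamiliar with the PCA parameterization that CNN-PCA and O-PCA post-process, the standard construction can be written as follows. The notation is assumed for this sketch and does not come from the abstract: $N_r$ prior realizations $m_i \in \mathbb{R}^{N_c}$ with mean $\bar{m}$, and $l$ retained principal components.

```latex
% Assumed notation (not given in the abstract): N_r prior realizations
% m_i in R^{N_c}, their mean \bar{m}, and l retained components.
\begin{align}
Y &= \frac{1}{\sqrt{N_r - 1}}
     \begin{bmatrix} m_1 - \bar{m} & \cdots & m_{N_r} - \bar{m} \end{bmatrix}
   = U \Sigma V^{\top}, \\
m_{\mathrm{pca}}(\xi) &= \bar{m} + U_l \Sigma_l \, \xi,
   \qquad \xi \sim \mathcal{N}(0, I_l).
\end{align}
```

Here $U_l$ and $\Sigma_l$ keep the $l$ leading singular vectors and singular values, so a model with $N_c$ grid cells is described by the $l$-dimensional vector $\xi$. The post-processing step then maps $m_{\mathrm{pca}}$ to a model that better honors the geological features.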

deep learning, realization, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

1807.02716

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
